Synthesis Speech Based Data Augmentation for Low Resource Children ASR
Authors
Abstract
Successful speech recognition for children requires large training data with sufficient speaker variability. Collecting such a database of children's voices is challenging and very expensive for a zero/low-resource language like Punjabi. In this paper, the data scarcity issue for low-resourced Punjabi is addressed through two levels of augmentation. The original corpus is first augmented by modifying the prosody parameters of pitch and speaking rate. Our results show that this augmentation improves system performance over the baseline system. The augmented data is then combined with the original data and used to train a TTS system that generates synthesized speech. The extended dataset is further augmented by generating utterances using text-to-speech synthesis and language model sampling methods that increase acoustic and lexical diversity. The final result indicates relative improvements of 50.10% and 57.40% with acoustic and lexical diversity based augmentation, respectively, in comparison to the baseline system.
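The first augmentation level described in the abstract modifies pitch and speaking rate. As a rough illustration only, the sketch below applies simple speed perturbation by linear-interpolation resampling in pure Python. This is an assumption made for illustration, not the authors' method: resampling changes pitch and duration together, whereas the paper's prosody modification may control them independently. The function name `speed_perturb`, the factors 0.9 and 1.1, and the 16 kHz test tone are all hypothetical.

```python
import math


def speed_perturb(samples, factor):
    """Resample `samples` so they play back `factor` times faster.

    Illustrative only: factor > 1 shortens the signal and raises pitch,
    factor < 1 lengthens it and lowers pitch (both change together).
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    n_in = len(samples)
    n_out = int(n_in / factor)
    out = []
    for i in range(n_out):
        pos = i * factor                  # fractional position in the input
        lo = min(int(pos), n_in - 1)      # left neighbour, clamped for safety
        hi = min(lo + 1, n_in - 1)        # right neighbour, clamped at the end
        frac = pos - lo
        out.append((1.0 - frac) * samples[lo] + frac * samples[hi])
    return out


# Hypothetical input: one second of a 440 Hz tone at 16 kHz.
SAMPLE_RATE = 16000
tone = [math.sin(2 * math.pi * 440 * t / SAMPLE_RATE) for t in range(SAMPLE_RATE)]

# Three copies of the utterance: slower, original, faster.
augmented = [speed_perturb(tone, f) for f in (0.9, 1.0, 1.1)]
```

In practice a signal-processing toolkit would normally be used for this step so that pitch and tempo can be varied separately; the sketch only shows how one source utterance can yield several acoustically different training copies.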
Similar resources
Data augmentation for low resource languages
Recently there has been interest in the approaches for training speech recognition systems for languages with limited resources. Under the IARPA Babel program such resources have been provided for a range of languages to support this research area. This paper examines a particular form of approach, data augmentation, that can be applied to these situations. Data augmentation schemes aim to incr...
Data Augmentation for Low-Resource Neural Machine Translation
The quality of a Neural Machine Translation system depends substantially on the availability of sizable parallel corpora. For low-resource language pairs this is not the case, resulting in poor translation quality. Inspired by work in computer vision, we propose a novel data augmentation approach that targets low-frequency words by generating new sentence pairs containing rare words in new, syn...
Training Data Augmentation for Low-Resource Morphological Inflection
This work describes the UoE-LMU submission for the CoNLL-SIGMORPHON 2017 Shared Task on Universal Morphological Reinflection, Subtask 1: given a lemma and target morphological tags, generate the target inflected form. We evaluate several ways to improve performance in the 1000-example setting: three methods to augment the training data with identical input-output pairs (i.e., autoencoding), a h...
Data augmentation, feature combination, and multilingual neural networks to improve ASR and KWS performance for low-resource languages
This paper presents the progress of acoustic models for low-resourced languages (Assamese, Bengali, Haitian Creole, Lao, Zulu) developed within the second evaluation campaign of the IARPA Babel project. This year, the main focus of the project is put on training high-performing automatic speech recognition (ASR) and keyword search (KWS) systems from language resources limited to about 10 hours o...
Two-Stage Data Augmentation for Low-Resourced Speech Recognition
Low resourced languages suffer from limited training data and resources. Data augmentation is a common approach to increasing the amount of training data. Additional data is synthesized by manipulating the original data with a variety of methods. Unlike most previous work that focuses on a single technique, we combine multiple, complementary augmentation approaches. The first stage adds noise a...
Journal
Journal title: Lecture Notes in Computer Science
Year: 2021
ISSN: 1611-3349, 0302-9743
DOI: https://doi.org/10.1007/978-3-030-87802-3_29